Learner Corpora and Natural Language Processing
Abstract
Learner corpora collect the language produced by people learning a first or second language. Natural Language Processing (NLP) deals with the representation and the automatic analysis and generation of human language. The two thus overlap in the representation and automatic analysis of learner language, which is the topic of this chapter. The chapter therefore focuses on one of the two application areas for NLP in the context of language learning (cf. Meurers, 2012). The other area is the use of NLP to process native language in the language to be learned, for example to generate exercises, to support retrieval of reading material at the appropriate learner level, or to present texts to learners with visual or other enhancements that support language learning.
Similar resources
On the Automatic Analysis of Learner Language. Introduction to the Special Issue
Natural language processing (NLP) has long been used to automatically analyze language produced by language learners, typically aimed at providing individualized feedback and learner modeling in Intelligent Computer-Assisted Language Learning systems (cf. Heift & Schulze 2007). While much interesting research has been reported, it is difficult to determine the state of the art for the automatic...
Connecting NLP and Language Learning
Detmar Meurers (2012). Natural Language Processing and Language Learning. Encyclopedia of Applied Linguistics, edited by Carol A. Chapelle. Blackwell. 4193–4205.
Detmar Meurers (2015). Learner Corpora and Natural Language Processing. The Cambridge Handbook of Learner Corpus Research, edited by Sylviane Granger, Gaëtanelle Gilquin and Fanny Meunier. Cambridge University Press.
Luiz Amaral ...
Combining Part of Speech Induction and Morphological Induction
Linguistic information is useful in natural language processing, information retrieval and a multitude of sub-tasks involving language analysis. Two types of linguistic information in all languages are part of speech and morphology. Part of speech information reflects syntactic structure and can assist in tasks such as speech recognition, machine translation and word sense disambiguation. Morph...
Creating a manually error-tagged and shallow-parsed learner corpus
The availability of learner corpora, especially those which have been manually error-tagged or shallow-parsed, is still limited. This means that researchers do not have a common development and test set for natural language processing of learner English such as for grammatical error detection. Given this background, we created a novel learner corpus that was manually error-tagged and shallowpar...
Corpus based language technology for computer-assisted learning of Nordic languages: Squirrel Progress Report September 2001
Background and objectives of the project. Background: The most prevalent forms of information technology in computer-assisted language learning (CALL) applications are email, multimedia and computerised multiple-choice test forms. Until recently, the contribution of language technology (LT) to CALL has been next to nonexistent, although there is now a growing amount of work in the LT commun...